Edge Pruning



PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization

Wang, Xinhai, Yang, Shu, Wang, Liangyu, Zhang, Lin, Xie, Huanyi, Hu, Lijie, Wang, Di

arXiv.org Artificial Intelligence

Circuit discovery, which involves identifying sparse and task-relevant subnetworks in pre-trained language models, is a cornerstone of mechanistic interpretability. Automated Circuit Discovery (ACDC) has emerged as a pivotal methodology in circuit discovery, but its application to large language models is severely limited by computational inefficiency and prohibitively high memory requirements. Although several accelerated approaches have been proposed, they primarily rely on linear approximations to ACDC, which significantly compromises analytical faithfulness. Our proposed method for accelerating automated circuit discovery, Per Attention Head Quantization (PAHQ), takes a fundamentally different approach by optimizing the efficiency of each individual patching operation. PAHQ leverages a fundamental alignment between activation patching and mixed-precision quantization (MPQ): interpretability analysis through patching essentially performs targeted ablation studies. Therefore, we can maintain high precision exclusively for investigated components while safely reducing precision elsewhere in the network. PAHQ-accelerated ACDC reduces runtime by up to 80% and memory consumption by up to 30% compared to unaccelerated ACDC while maintaining faithfulness. Importantly, our method readily integrates with existing edge-based circuit discovery techniques by modifying the attention computation mechanism. This training-free approach provides a practical and novel pathway for accelerating mechanistic interpretability methods. Our code is available at https://github.com/626619403/PAHQ.
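To make the core idea concrete, here is a minimal sketch of per-head mixed precision during patching. Everything below is illustrative: the function names, the symmetric int8 fake quantization, and the weight layout are our assumptions, not PAHQ's actual implementation.

```python
# Illustrative sketch of per-attention-head mixed precision during activation
# patching, in the spirit of PAHQ. Names and weight layout are assumptions.
import torch

def fake_quantize_int8(w: torch.Tensor) -> torch.Tensor:
    """Symmetric int8 fake quantization: quantize, then dequantize."""
    scale = w.abs().max() / 127.0
    if scale == 0:
        return w
    return torch.round(w / scale).clamp(-128, 127) * scale

def mixed_precision_heads(w_q: torch.Tensor, n_heads: int, target_head: int) -> torch.Tensor:
    """Keep the investigated head's slice of a (d_model, d_model) projection
    in full precision; fake-quantize every other head's slice."""
    d_head = w_q.shape[1] // n_heads
    w_out = w_q.clone()
    for h in range(n_heads):
        if h == target_head:
            continue  # the head being patched stays high-precision
        cols = slice(h * d_head, (h + 1) * d_head)
        w_out[:, cols] = fake_quantize_int8(w_out[:, cols])
    return w_out

# Toy usage: 8 heads, d_model = 64; head 3 is under investigation.
w = torch.randn(64, 64)
w_mixed = mixed_precision_heads(w, n_heads=8, target_head=3)
print((w - w_mixed).abs().max())  # quantization error on non-target heads only
```

In a real run the low-precision slices would be stored as actual int8 tensors to realize the memory savings; the fake quantization above only mimics the numerical effect.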



BlackboxNLP-2025 MIB Shared Task: Exploring Ensemble Strategies for Circuit Localization Methods

Mondorf, Philipp, Wang, Mingyang, Gerstner, Sebastian, Hakimi, Ahmad Dawar, Liu, Yihong, Veloso, Leonor, Zhou, Shijia, Schütze, Hinrich, Plank, Barbara

arXiv.org Artificial Intelligence

The Circuit Localization track of the Mechanistic Interpretability Benchmark (MIB) evaluates methods for localizing circuits within large language models (LLMs), i.e., subnetworks responsible for specific task behaviors. In this work, we investigate whether ensembling two or more circuit localization methods can improve performance. We explore two variants: parallel and sequential ensembling. In parallel ensembling, we combine the attribution scores assigned to each edge by different methods, e.g., by averaging or taking the minimum or maximum value. In the sequential ensemble, we use edge attribution scores obtained via EAP-IG as a warm start for a more expensive but more precise circuit identification method, namely Edge Pruning. We observe that both approaches yield notable gains on the benchmark metrics, leading to a more precise circuit identification approach. Finally, we find that taking a parallel ensemble over various methods, including the sequential ensemble, achieves the best results. We evaluate our approach in the BlackboxNLP 2025 MIB Shared Task, comparing ensemble scores to official baselines across multiple model-task combinations.
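As a concrete illustration of the parallel variant, the sketch below combines per-edge attribution scores from two hypothetical methods. The edge names and score values are invented for the example; the combination rules (mean, min, max) follow the abstract.

```python
# Sketch of parallel ensembling over edge attribution scores. The two "method"
# dictionaries and the edge names are invented for illustration.
from statistics import mean

def parallel_ensemble(score_dicts, combine=mean):
    """Combine per-edge attribution scores edge by edge (mean/min/max/...)."""
    edges = set().union(*score_dicts)
    return {e: combine([s.get(e, 0.0) for s in score_dicts]) for e in edges}

eap_ig    = {("attn.0.h3", "mlp.1"): 0.9, ("mlp.1", "logits"): 0.4}
act_patch = {("attn.0.h3", "mlp.1"): 0.7, ("mlp.1", "logits"): 0.6}

print(parallel_ensemble([eap_ig, act_patch]))               # averaging
print(parallel_ensemble([eap_ig, act_patch], combine=max))  # maximum value
```

The sequential variant described in the abstract would instead feed one method's scores (here, the EAP-IG dictionary) in as the initialization for Edge Pruning's edge masks rather than combining them directly.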





Finding Transformer Circuits With Edge Pruning

Neural Information Processing Systems

The path to interpreting a language model often proceeds via analysis of circuits---sparse computational subgraphs of the model that capture specific aspects of its behavior. Recent work has automated the task of discovering circuits. Yet, these methods have practical limitations, as they rely either on inefficient search algorithms or on inaccurate approximations. In this paper, we frame circuit discovery as an optimization problem and propose Edge Pruning as an effective and scalable solution. Our method finds circuits in GPT-2 that use fewer than half as many edges as circuits found by previous methods, while being equally faithful to the full model's predictions on standard circuit-finding tasks.
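The abstract frames circuit discovery as an optimization problem; one common way to write such an objective (notation ours, not necessarily the paper's exact formulation) is to learn a gate per edge that keeps the circuit faithful to the full model under a sparsity budget:

```latex
% Circuit discovery as constrained optimization (notation ours, for
% illustration): z gates the E edges of the model's computational graph.
\min_{z \in [0,1]^{E}} \;
  \mathbb{E}_{x \sim \mathcal{D}}
  \left[ D_{\mathrm{KL}}\!\left( p_{\mathrm{full}}(\cdot \mid x)
  \;\middle\|\; p_{\mathrm{circuit}}(\cdot \mid x; z) \right) \right]
  \quad \text{s.t.} \quad \lVert z \rVert_{0} \le k
```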


Spectral Theory for Edge Pruning in Asynchronous Recurrent Graph Neural Networks

Bessone, Nicolas

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have emerged as a powerful tool for learning on graph-structured data, finding applications in numerous domains including social network analysis and molecular biology. Within this broad category, Asynchronous Recurrent Graph Neural Networks (ARGNNs) stand out for their ability to capture complex dependencies in dynamic graphs, resembling living organisms' intricate and adaptive nature. However, their complexity often leads to large and computationally expensive models. Therefore, pruning unnecessary edges becomes crucial for enhancing efficiency without significantly compromising performance. This paper presents a dynamic pruning method based on graph spectral theory, leveraging the imaginary component of the eigenvalues of the network graph's Laplacian.
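The abstract's criterion, the imaginary component of the Laplacian spectrum, can be prototyped in a few lines. The sketch below scores each directed edge by how much its removal perturbs the total imaginary spectral mass and prunes the least impactful edges; this is one plausible reading of the idea, not the paper's exact rule.

```python
# Illustrative sketch only: score directed edges by how much their removal
# perturbs the imaginary part of the graph Laplacian's spectrum, then prune
# the lowest-impact edges. Our reading of the idea, not the paper's criterion.
import numpy as np

def laplacian(adj: np.ndarray) -> np.ndarray:
    return np.diag(adj.sum(axis=1)) - adj  # out-degree Laplacian

def imag_mass(adj: np.ndarray) -> float:
    return float(np.abs(np.linalg.eigvals(laplacian(adj)).imag).sum())

def prune_edges(adj: np.ndarray, n_prune: int) -> np.ndarray:
    adj = adj.copy()
    base = imag_mass(adj)
    scores = {}
    for i, j in zip(*np.nonzero(adj)):
        trial = adj.copy()
        trial[i, j] = 0.0
        scores[(i, j)] = abs(base - imag_mass(trial))  # spectral impact
    for (i, j), _ in sorted(scores.items(), key=lambda kv: kv[1])[:n_prune]:
        adj[i, j] = 0.0  # drop the edges that matter least to the spectrum
    return adj

# Toy directed graph: a 4-cycle plus one chord.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    A[i, j] = 1.0
print(prune_edges(A, n_prune=1))
```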


Finding Transformer Circuits with Edge Pruning

Bhaskar, Adithya, Wettig, Alexander, Friedman, Dan, Chen, Danqi

arXiv.org Artificial Intelligence

The path to interpreting a language model often proceeds via analysis of circuits -- sparse computational subgraphs of the model that capture specific aspects of its behavior. Recent work has automated the task of discovering circuits. Yet, these methods have practical limitations, as they rely either on inefficient search algorithms or on inaccurate approximations. In this paper, we frame automated circuit discovery as an optimization problem and propose *Edge Pruning* as an effective and scalable solution. Edge Pruning leverages gradient-based pruning techniques, but instead of removing neurons or components, it prunes the *edges* between components. Our method finds circuits in GPT-2 that use fewer than half as many edges as circuits found by previous methods, while being equally faithful to the full model's predictions on standard circuit-finding tasks. Edge Pruning is efficient even with as many as 100K examples, outperforming previous methods in speed and producing substantially better circuits. It also perfectly recovers the ground-truth circuits in two models compiled with Tracr. Thanks to its efficiency, we scale Edge Pruning to CodeLlama-13B, a model over 100x larger than those prior methods operate on. We use this setting for a case study comparing the mechanisms behind instruction prompting and in-context learning. We find two circuits with more than 99.96% sparsity that match the performance of the full model and reveal that the mechanisms in the two settings overlap substantially. Our case study shows that Edge Pruning is a practical and scalable tool for interpretability and sheds light on behaviors that only emerge in large models.
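Since this is the fullest statement of the method, a small sketch may help: place a learnable gate on every edge, optimize the gates for faithfulness (KL divergence to the full model) plus a sparsity penalty, then threshold. The sigmoid relaxation and the toy per-edge contributions below stand in for the paper's hard-concrete gating and real model internals; every name here is illustrative.

```python
# Minimal sketch of the edge-pruning idea: a learnable gate per edge,
# optimized for faithfulness plus sparsity. Simplified for illustration;
# not the authors' implementation.
import torch

class EdgeGates(torch.nn.Module):
    def __init__(self, n_edges: int):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(n_edges))

    def forward(self):
        return torch.sigmoid(self.logits)  # soft gate in [0, 1] per edge

def objective(full_logits, circuit_logits, gates, lam=1e-2):
    # Faithfulness: KL(full || circuit); sparsity: expected open-edge count.
    kl = torch.nn.functional.kl_div(
        circuit_logits.log_softmax(-1),
        full_logits.softmax(-1),
        reduction="batchmean",
    )
    return kl + lam * gates.sum()

# Toy usage: 10 edges; pretend circuit logits are gated edge contributions.
gates_mod = EdgeGates(10)
opt = torch.optim.Adam(gates_mod.parameters(), lr=0.1)
full = torch.randn(4, 5)              # stand-in full-model logits
edge_contrib = torch.randn(10, 4, 5)  # stand-in per-edge contributions
for _ in range(100):
    g = gates_mod()
    circuit = (g[:, None, None] * edge_contrib).sum(0)
    loss_val = objective(full, circuit, g)
    opt.zero_grad()
    loss_val.backward()
    opt.step()
print((gates_mod().detach() > 0.5).sum().item(), "edges kept")
```

In practice one would anneal the gates toward hard 0/1 values and compute the circuit's logits by actually masking edges in the transformer's computational graph rather than summing synthetic contributions.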